In this paper we introduce a new approach to study jet substructure in the center-of-mass frame
of the jet. We demonstrate that it can be used to discriminate boosted heavy particles from
QCD jets and that the method is complementary to other jet substructure algorithms. Applications to
searches for hadronically decaying W/Z+jets and heavy resonances that decay to a WW final state
are also discussed.

I. INTRODUCTION

Many theories beyond the standard model (SM) predict new particles with
masses at the TeV scale. Some of these heavy resonances, such as a Z', a W',
a heavy Higgs or fourth generation quarks, can decay to final states with
an electroweak gauge boson, W or Z, or a top quark t. Because of the energy
scale of these processes, the W, Z and t from the heavy resonance decay are
highly boosted. Their hadronic decay products are often so collimated that
they appear as a single jet, hereafter called W, Z or t jets. The presence
of boosted W, Z and t jets gives us a unique experimental signature to look
for new physics (NP) phenomena beyond the SM.

In recent years, theoretical and experimental studies have been performed
to investigate the signature of boosted particles at the LHC, including not
only W and Z bosons [1-8] and t quarks [9-17] but also a boosted light Higgs
boson [18-27]. In these studies, the complete final state of the heavy
particle is reconstructed as a single jet. The invariant mass of the
reconstructed jet (m_jet) is therefore a good indicator of its origin. It
has been shown that by using the technique of boosted jets, one can often
achieve comparable, and sometimes even better, sensitivity to NP at the TeV
scale. However, one experimental challenge in the application of boosted
W, Z and t jets is the copious production of QCD jets at the LHC, where QCD
jets are defined as jets initiated by a non-top quark or a gluon. As a
result, the jet mass alone may not provide sufficient discriminating power
to distinguish W, Z and t jets from the overwhelming QCD background in many
analyses. In the last few years many techniques have been developed to
address this issue by exploring jet substructure as an additional
experimental handle to identify boosted heavy objects.

In general, jet substructure techniques can be classified into two
categories. The first category employs jet shape observables [28] to probe
the energy distribution inside jets. The second category uses jet-grooming
algorithms, including filtering [18], pruning [3, 4] and trimming [29].
These take advantage of the characteristics of the subjets within a jet by
reclustering the energy clusters of the jet with the k_T or
Cambridge-Aachen (CA) sequential jet reconstruction algorithms. So far,
most jet substructure techniques are based on energy clusters measured in
the lab frame. In this paper, we introduce a different approach to study
jet substructure in the center-of-mass frame of the jet. A similar idea has
also been explored to search for a hadronically decaying Higgs boson [27].

We organize this paper as follows: In Section II we describe the event
sample used in the study. Section III discusses the method to study jet
substructure in the jet center-of-mass frame and its performance. Several
examples of the application of our method are given in Section IV. We
conclude in Section V.



II. EVENT SAMPLE 

We use boosted W jets, from the SM process of W+jets production, as an
example to illustrate our proposed jet substructure method. For simplicity
we only consider the background from SM dijet production, since its cross
section is several orders of magnitude larger than those of the other SM
processes. However, our method is generic and is applicable to all boosted
hadronically decaying objects, such as the Z boson, Higgs boson, or t quark.
In addition, we also generate events to simulate SM Z+jets production and a
heavy particle X that decays to a WW final state.

All the events used in this analysis are produced using the Pythia 6.421
event generator [30] for pp collisions at a center-of-mass energy of 7 TeV.
The Pythia parameters are set to the default ATLAS parameters tuned to
describe the expected multiple interactions. In order to simulate the finite
resolution of the calorimeters at the LHC, we divide the (η, φ) plane into
0.1 × 0.1 cells. In each event we sum the energy of the particles entering
each cell, except for neutrinos and muons, and replace them with a massless
pseudoparticle of the same energy, also referred to as an energy cluster,
pointing to the center of the cell. These pseudoparticles are fed into the
FastJet [31] package for jet reconstruction. The jets are reconstructed
with the anti-k_T algorithm [32] with a distance parameter of ΔR = 0.6.
Currently the anti-k_T jet algorithm is the default one used at the ATLAS
and CMS experiments.
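
The cell-summing step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code used in the study: the function name `build_energy_clusters`, the tuple layout `(E, px, py, pz, pdg_id)` and the simple floor-based cell indexing (which ignores the φ wrap-around at ±π) are all hypothetical choices. Neutrinos and muons are identified by their standard PDG codes.

```python
import math
from collections import defaultdict

CELL_SIZE = 0.1                    # cell width in eta and phi, as quoted in the text
INVISIBLE_PDG = {12, 14, 16, 13}   # neutrinos and muons (PDG codes), left out of the sum


def build_energy_clusters(particles):
    """Sum particle energies per (eta, phi) cell and return one massless
    pseudoparticle (E, px, py, pz) per cell, pointing to the cell center.

    `particles` is an iterable of (E, px, py, pz, pdg_id) tuples (assumed layout).
    """
    cell_energy = defaultdict(float)
    for energy, px, py, pz, pdg_id in particles:
        if abs(pdg_id) in INVISIBLE_PDG:
            continue
        p = math.sqrt(px * px + py * py + pz * pz)
        if p == 0.0 or abs(pz) >= p:
            continue  # direction undefined or exactly along the beam axis
        eta = math.atanh(pz / p)       # pseudorapidity
        phi = math.atan2(py, px)       # azimuth in (-pi, pi]
        key = (math.floor(eta / CELL_SIZE), math.floor(phi / CELL_SIZE))
        cell_energy[key] += energy

    clusters = []
    for (i_eta, i_phi), energy in cell_energy.items():
        eta = (i_eta + 0.5) * CELL_SIZE    # cell center
        phi = (i_phi + 0.5) * CELL_SIZE
        pt = energy / math.cosh(eta)       # massless: |p| = E
        clusters.append((energy,
                         pt * math.cos(phi),
                         pt * math.sin(phi),
                         pt * math.sinh(eta)))
    return clusters
```

The resulting list of four-vectors is what would then be handed to a jet-finding package.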



III. JET SUBSTRUCTURE IN THE REST FRAME

In this section we describe the method to study jet substructure in the
center-of-mass frame of the jet in order to distinguish boosted hadronically
decaying particles from QCD jets. We select jets with p_T > 300 GeV and
|η| < 2.5 as W jet candidates. We further require that the W jet candidates
have 40 GeV < m_jet < 140 GeV. If there is more than one candidate in an
event, we keep the candidate with the highest p_T. Studies using W+jets
Monte Carlo (MC) samples show that this procedure selects the correct W jet
signal candidate more than 90% of the time.



A. Center-of-mass frame of a jet 

We define the center-of-mass frame (rest frame) of a jet as the frame where
the four-momentum of the jet is equal to p^rest_jet = (m_jet, 0, 0, 0). A
jet consists of its constituent particles. The distribution of the
constituent particles of a boosted W/Z, t or Higgs jet in its center-of-mass
frame is almost identical to that of a W/Z, t or Higgs particle produced at
rest. For example, in the rest frame of a hadronically decaying W boson, the
constituent particles look like a back-to-back dijet event. Similarly, for a
hadronically decaying t quark, the constituent particle distribution has a
three-body decay topology in its rest frame. On the other hand, a QCD jet
acquires its mass through gluon radiation and is not a closed system. Its
constituent particle distribution in the rest frame does not correspond to
any physical state and is more likely to be random, as illustrated in
Figure 1. This observation is analogous to the one made in e+e- → Υ(4S)
experiments, such as BaBar and Belle. In the latter case the event shape is
used to help disentangle the BB̄ signal, whose decay products have an
isotropic distribution, from the continuum background that has a pronounced
two-jet structure. As a result, by going to the jet rest frame, we can apply
the knowledge of event shape variables learned from e+e- experiments to the
experiments at the LHC in order to separate boosted heavy objects from QCD
jets. Furthermore, the correlation between the jet substructure and the jet
momentum is expected to be small by construction.
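
As an illustration of the definition above, the following Python sketch boosts a list of jet constituents into the frame where their summed four-momentum is (m_jet, 0, 0, 0). It uses the standard Lorentz boost formula; the function name `boost_to_rest_frame` and the `(E, px, py, pz)` tuple layout are assumptions made for this example, not part of the original analysis.

```python
import math


def boost_to_rest_frame(constituents):
    """Boost (E, px, py, pz) constituents into the jet center-of-mass frame.

    Returns (boosted_constituents, m_jet). Assumes the summed four-momentum
    is time-like (m_jet > 0).
    """
    e_jet = sum(c[0] for c in constituents)
    px_jet = sum(c[1] for c in constituents)
    py_jet = sum(c[2] for c in constituents)
    pz_jet = sum(c[3] for c in constituents)
    m_jet = math.sqrt(e_jet**2 - px_jet**2 - py_jet**2 - pz_jet**2)

    # Velocity of the jet in the lab frame and the corresponding gamma factor.
    bx, by, bz = px_jet / e_jet, py_jet / e_jet, pz_jet / e_jet
    beta2 = bx * bx + by * by + bz * bz
    gamma = e_jet / m_jet

    boosted = []
    for e, px, py, pz in constituents:
        if beta2 == 0.0:               # jet already at rest
            boosted.append((e, px, py, pz))
            continue
        bp = bx * px + by * py + bz * pz
        # p' = p + [(gamma - 1) (beta.p) / beta^2 - gamma E] beta,  E' = gamma (E - beta.p)
        coef = (gamma - 1.0) * bp / beta2 - gamma * e
        boosted.append((gamma * (e - bp),
                        px + coef * bx,
                        py + coef * by,
                        pz + coef * bz))
    return boosted, m_jet
```

For a jet made of two massless clusters, the boosted clusters come out back to back with an energy of m_jet/2 each, which is the two-body topology discussed in the text.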



B. Shape variables 



We introduce five shape variables that are commonly used at e+e-
experiments [33]. All the variables are calculated using the energy
clusters of a jet in its center-of-mass frame. They are:
FIG. 1: Illustration of the constituent particle distribution of
a jet. (a) Jet in the lab frame, (b) Jet of a boosted particle 
decaying to a two-body final state in the jet rest frame, (c) 
Jet of a boosted particle decaying to a three-body final state 
in the jet rest frame, (d) QCD jet in its rest frame. 



• Thrust: The thrust axis [34, 35] of a jet in its center-of-mass frame,
\hat{T}, is defined as the direction which maximizes the sum of the
longitudinal momenta of the energy clusters. The thrust, T, is related to
this direction and is calculated as:

T = \frac{\sum_i |\hat{T} \cdot \vec{p}_i|}{\sum_i |\vec{p}_i|},    (1)

where \vec{p}_i is the momentum of each energy cluster in the jet rest
frame. The allowed range of T is between 0.5 and 1, where T = 1 corresponds
to a highly directional distribution of the energy clusters, and T = 0.5
corresponds to an isotropic distribution.

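• Thrust minor: The thrust minor [34, 35], T_min, is defined as:

T_{\min} = \frac{\sum_i |\hat{T}_{\min} \cdot \vec{p}_i|}{\sum_i |\vec{p}_i|},    (2)

where, following the standard definition, \hat{T}_{\min} is the direction
perpendicular to both the thrust axis and the thrust major axis, the latter
being the direction perpendicular to the thrust axis that maximizes the same
sum. T_min = 0 corresponds to a highly directional distribution of the
energy clusters, and T_min = 0.5 corresponds to an isotropic distribution.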

• Sphericity: The sphericity tensor [36] is defined as:

S^{\alpha\beta} = \frac{\sum_i p_i^{\alpha} p_i^{\beta}}{\sum_i |\vec{p}_i|^2},    (3)

where α and β correspond to the x, y and z components of the momentum of
each energy cluster in the jet rest frame. By standard diagonalization of
S^{αβ} one obtains three eigenvalues λ_1 ≥ λ_2 ≥ λ_3, with
λ_1 + λ_2 + λ_3 = 1. The sphericity is then defined as

S = \frac{3}{2} (\lambda_2 + \lambda_3).    (4)

Sphericity is a measure of the summed squares of the transverse momenta of
all the energy clusters with respect to the jet axis, and 0 ≤ S ≤ 1. A jet
with two back-to-back subjets in its rest frame has S = 0, and S = 1
indicates an isotropic distribution of the energy clusters.

• Aplanarity: The aplanarity [36] is defined as

A = \frac{3}{2} \lambda_3,    (5)

and is constrained to the range 0 ≤ A ≤ 1/2. A highly directional
distribution of the energy clusters has A = 0, and A = 0.5 corresponds to
an isotropic distribution.

• Fox-Wolfram moments: The Fox-Wolfram moments [33, 37] are defined as

H_l = \sum_{i,j} \frac{|\vec{p}_i| |\vec{p}_j|}{E^2} P_l(\cos\theta_{ij}),    (6)

where θ_ij is the opening angle between energy clusters i and j, E is the
total energy of the clusters in the jet rest frame, and the P_l(x) are the
Legendre polynomials. Since each energy cluster is a massless
pseudoparticle, H_0 = 1. For a jet that has a structure of two back-to-back
subjets in its rest frame, H_1 = 0, H_l ≈ 1 for even l and H_l ≈ 0 for odd l.
In our application, the ratio between the second-order and zeroth-order
Fox-Wolfram moments, R_2, is used as the discriminating variable.

The distributions of the jet shape variables are shown in Figure 2 for the
W jet signal and the QCD jet background. The shape variables of the W jet
signal clearly show a back-to-back two-body topology, while those of the
QCD jets indicate an isotropic-like distribution. They are very similar to
the distributions of event shape variables observed in e+e- → Υ(4S) [33],
which is at a much lower mass scale than the W boson. This indicates that
the newly introduced shape variables in the jet rest frame indeed
encapsulate properties of the jet substructure and are relatively
independent of the particle mass scale.

We compare our new shape variables to the eccentricity [13]. The jet
eccentricity is a commonly used jet shape variable in the lab frame and is
defined as the difference between the maximum and minimum values of the
variances of the jet constituents along the principal and minor axes,
respectively. The distribution of the eccentricity is also shown in
Figure 2. The new shape variables have comparable but slightly lower
background rejection power at the same signal efficiency in the large mass
window 40 < m_jet < 140 GeV, as shown in Figure 3. However, studies show
that a variable calculated in the jet rest frame is less correlated with
the jet mass in the QCD background sample. As shown in Figure 3, when we
tighten the selection on the shape variables to reject a large fraction of
the QCD background, the eccentricity tends to reject more background events
with low m_jet and thus creates a significant kinematic enhancement near the
W boson mass peak; this is not the case for some of the shape variables in
the jet rest frame, such as thrust minor, sphericity and aplanarity. For
thrust and R_2, the kinematic enhancement is less significant than that for
eccentricity. Therefore, our proposed jet substructure method in the jet
rest frame has an experimental advantage in separating boosted W/Z bosons
from the large QCD background.
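
A comparison of background rejection at a fixed signal efficiency, as discussed above, can be sketched as follows. The helper `rejection_at_efficiency` is a hypothetical name, and the inputs are assumed to be plain lists of a single discriminating variable evaluated on unweighted signal and background jets.

```python
import math


def rejection_at_efficiency(sig_values, bkg_values, target_eff, higher_is_signal=True):
    """Return (cut, background_rejection) for the loosest cut that keeps
    at least `target_eff` of the signal sample."""
    ordered = sorted(sig_values, reverse=higher_is_signal)
    n_keep = max(1, math.ceil(target_eff * len(ordered)))
    cut = ordered[n_keep - 1]          # value of the last signal jet that is kept
    if higher_is_signal:
        passed = sum(1 for x in bkg_values if x >= cut)
    else:
        passed = sum(1 for x in bkg_values if x <= cut)
    return cut, 1.0 - passed / len(bkg_values)
```

Scanning `target_eff` from 0 to 1 traces out an efficiency versus rejection curve of the kind shown in Figure 3(a). Repeating the jet mass histogram for the background jets that pass a tight cut gives the kind of check shown in Figure 3(b).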



C. Reclustering 

Another important application of our proposed method is to recluster the
energy clusters of a jet to reconstruct subjets in the jet rest frame. We
perform such a study using the Cambridge-Aachen (CA) sequential jet
reconstruction algorithm with a modified distance parameter of 0.6, where
the distance is defined as the angle between two pseudoparticles in the jet
rest frame. We introduce several other discriminating variables: the
fraction of energy carried by the leading subjet (f_E1), the fraction of
energy carried by the second leading subjet (f_E2), the energy asymmetry
(A_E), defined as A_E = (f_E1 − f_E2)/(f_E1 + f_E2), and the opening angle
(Δθ) between the two leading subjets. Their distributions are shown in
Figure 4. Studies show that most W jets have back-to-back subjets whose
energies are around half of the W boson mass, while the distributions from
QCD jets are irregular.
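
The reclustering and the subjet variables above can be illustrated with the Python sketch below. It is a simplified stand-in and not the FastJet-based procedure of the paper: clusters are merged pairwise in order of increasing opening angle until the smallest angle exceeds the distance parameter, which mimics a Cambridge-Aachen-style clustering with an angular distance. The function names are hypothetical, and the energy fractions are assumed to be taken relative to the summed subjet energy in the rest frame.

```python
import math


def opening_angle(a, b):
    """Angle between the three-momenta of two (E, px, py, pz) four-vectors."""
    dot = a[1] * b[1] + a[2] * b[2] + a[3] * b[3]
    na = math.sqrt(a[1] ** 2 + a[2] ** 2 + a[3] ** 2)
    nb = math.sqrt(b[1] ** 2 + b[2] ** 2 + b[3] ** 2)
    return math.acos(max(-1.0, min(1.0, dot / (na * nb))))


def cluster_by_angle(clusters, max_angle=0.6):
    """Merge the closest pair (in angle) until all pairs are separated by more
    than `max_angle`; return the subjets sorted by decreasing energy."""
    jets = [tuple(c) for c in clusters]
    while len(jets) > 1:
        dist, i, j = min((opening_angle(jets[i], jets[j]), i, j)
                         for i in range(len(jets))
                         for j in range(i + 1, len(jets)))
        if dist > max_angle:
            break
        merged = tuple(x + y for x, y in zip(jets[i], jets[j]))
        jets = [jets[k] for k in range(len(jets)) if k not in (i, j)] + [merged]
    return sorted(jets, key=lambda jet: jet[0], reverse=True)


def subjet_variables(subjets):
    """Return (f_E1, f_E2, A_E, delta_theta) for energy-ordered subjets."""
    e_tot = sum(jet[0] for jet in subjets)
    f1 = subjets[0][0] / e_tot
    f2 = subjets[1][0] / e_tot if len(subjets) > 1 else 0.0
    a_e = (f1 - f2) / (f1 + f2)
    dtheta = opening_angle(subjets[0], subjets[1]) if len(subjets) > 1 else 0.0
    return f1, f2, a_e, dtheta
```

For a W-like jet one expects two subjets with f_E1 ≈ f_E2 ≈ 0.5, A_E ≈ 0 and Δθ ≈ π.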

We compare the subjet information in the jet rest frame to the mass drop μ
and the splitting y, the two variables commonly used by existing two-body
subjet methods such as YSplitter [1] and the mass-drop tagger [18]. We
first recluster the reconstructed jets using a sequential recombination
algorithm. The last step of the clustering is then undone: j → j1, j2, with
m_j1 > m_j2. The mass drop and splitting are defined as μ = m_j1/m_j and
y = min(p_T,j1^2, p_T,j2^2) ΔR_j1,j2^2 / m_j^2. As shown in Figure 5, the
variables constructed using subjets in the jet rest frame have similar
background rejection power at the same signal efficiency. We further study
the correlation between the jet mass and the variables constructed using
subjets and find that the traditional variables μ and y tend to reject more
QCD jets with small m_jet and thus create a significant kinematic
enhancement near the W boson mass peak; this is also the case for f_E1,
f_E2 and Δθ, although their enhancements are relatively smaller. On the
other hand, no such enhancement is observed for the variable A_E, as shown
in Figure 5. We find that the variables f_E1, f_E2 and Δθ are
FIG. 2: The distributions of the jet shape variables: (a) Thrust, (b) Thrust-minor, (c) Sphericity, (d) Aplanarity, (e) R2 and
(f) Eccentricity for the W jet signal and QCD jet background. The eccentricity is a commonly used shape variable in the lab 
frame and is shown here for the purpose of comparison. All the distributions are normalized to unity. 
FIG. 3: (a) The signal efficiency of W jets vs. the background rejection of QCD jets for jet shape variables in the mass
window 40 < m_jet < 140 GeV. (b) The invariant mass distributions of QCD jets after 90% of them are rejected by a selection
requirement based solely on one of the shape variables: Thrust, Thrust-minor, Sphericity, Aplanarity, R2 and Eccentricity. All
the distributions are normalized to unity.




FIG. 4: The distributions of the kinematics of the reconstructed subjets from reclustering in the jet rest frame: (a) fraction of 
energy carried by the first leading jet, (b) fraction of energy carried by the second leading jet, (c) the asymmetry of the energies 
carried by the first and second leading jets and (d) the opening angle between the two leading jets. All the distributions are 
normalized to unity. 
FIG. 5: (a) The signal efficiency of W jets vs. the background rejection of QCD jets for jet substructure variables in the mass
window 40 < m_jet < 140 GeV. (b) The invariant mass distributions of QCD jets after 90% of them are rejected by a selection
requirement based solely on one of the variables: f_E1, f_E2, A_E, Δθ, μ and y. All the distributions are normalized to
unity.



highly correlated with the shape variables we introduced before. However,
the correlations between A_E and those variables are fairly small (less
than 20%). Thus we can gain additional discriminating power by combining
them using multivariate analysis techniques, such as neural networks or
boosted decision trees.
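
For reference, the lab-frame mass drop μ and splitting y used in the comparison above can be computed from the two parent subjets obtained by undoing the last clustering step. The Python sketch below assumes the subjets are given as `(E, px, py, pz)` tuples and that ΔR is built from rapidity and azimuth; the function name `mass_drop_and_splitting` is hypothetical.

```python
import math


def _mass(p):
    return math.sqrt(max(p[0] ** 2 - p[1] ** 2 - p[2] ** 2 - p[3] ** 2, 0.0))


def _pt2(p):
    return p[1] ** 2 + p[2] ** 2


def _rapidity(p):
    return 0.5 * math.log((p[0] + p[3]) / (p[0] - p[3]))


def mass_drop_and_splitting(j1, j2):
    """Return (mu, y) with mu = m_j1 / m_j and
    y = min(pT_j1^2, pT_j2^2) * dR_j1j2^2 / m_j^2, where m_j1 >= m_j2."""
    if _mass(j1) < _mass(j2):
        j1, j2 = j2, j1                       # j1 is the heavier subjet
    parent = tuple(a + b for a, b in zip(j1, j2))
    m_j = _mass(parent)
    dphi = math.atan2(j1[2], j1[1]) - math.atan2(j2[2], j2[1])
    dphi = (dphi + math.pi) % (2.0 * math.pi) - math.pi   # wrap into [-pi, pi)
    dr2 = (_rapidity(j1) - _rapidity(j2)) ** 2 + dphi ** 2
    mu = _mass(j1) / m_j
    y = min(_pt2(j1), _pt2(j2)) * dr2 / (m_j * m_j)
    return mu, y
```

For two massless subjets of equal p_T the mass drop is close to zero and y is close to one, the symmetric-splitting limit that two-body taggers look for.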

We also point out that the rest-frame subjet algorithm is infrared and
collinear safe if an infrared and collinear safe jet algorithm is used for
the rest-frame subjet clustering. All the sophisticated jet-grooming
algorithms introduced in the lab frame, such as pruning [3, 4] and
trimming [29], can be easily incorporated. The leading subjets in the jet
rest frame are not much affected by the underlying event and pileup. We
repeat our studies by generating MC events with different average numbers
of multiple interactions and observe no significant difference in the
performance of the jet substructure in the center-of-mass frame.



IV. APPLICATION 

As a first step, we consider the possibility of identifying boosted W/Z
bosons in current LHC data. In order to suppress the large QCD background,
we construct a likelihood variable using the shape variables in the jet
rest frame: thrust minor, sphericity, aplanarity and A_E. We optimize the
selection cut on the likelihood variable by maximizing S/√B, where S and B
are the numbers of signal and background events in the signal mass window
50 < m_jet < 115 GeV. The jet mass distributions of the W/Z+jet and QCD jet
MC event samples are shown in Figure 6. After applying the optimized
selection cut on the likelihood, we reject more than 95% of the QCD
background while keeping approximately 30% of the signal. The significance
S/√B in the signal window is more than 13. Note that here we treat both
boosted W and Z bosons as signal because the jet mass resolution is larger
than their mass difference. With enough data, their individual
contributions can be extracted by fitting the signal mass distribution.
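
The cut optimization described above can be sketched as follows. The text does not spell out how the likelihood variable is built, so the product-of-probabilities ratio in `likelihood_ratio` is an assumed, commonly used form and not necessarily the authors' construction; `best_cut` and its weight arguments are likewise hypothetical names for this example.

```python
import math


def likelihood_ratio(sig_probs, bkg_probs):
    """Assumed projective likelihood: L = prod(p_s) / (prod(p_s) + prod(p_b)),
    with one per-variable probability from signal and background templates."""
    ls = math.prod(sig_probs)
    lb = math.prod(bkg_probs)
    return ls / (ls + lb)


def best_cut(sig_scores, bkg_scores, sig_weight=1.0, bkg_weight=1.0):
    """Scan thresholds and return (cut, S/sqrt(B)) for the selection
    `score >= cut` that maximizes S/sqrt(B)."""
    best = (None, 0.0)
    for cut in sorted(set(sig_scores) | set(bkg_scores)):
        s = sig_weight * sum(1 for x in sig_scores if x >= cut)
        b = bkg_weight * sum(1 for x in bkg_scores if x >= cut)
        if b <= 0.0:
            continue                   # S/sqrt(B) is undefined without background
        z = s / math.sqrt(b)
        if z > best[1]:
            best = (cut, z)
    return best
```

The weights stand for the per-event normalization to the integrated luminosity, so that S and B are expected event counts in the signal mass window.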

While our study is based on MC simulated events and the results could be
somewhat optimistic, we point out that we have not yet used all the
available jet substructure variables. More sophisticated multivariate
analysis techniques, such as neural networks or boosted decision trees,
should further compensate for any potential underestimate of the
background. As a result, we expect to establish the signal of hadronically
decaying W/Z jets and measure their inclusive production cross section with
current LHC data. Such a measurement is not only a prerequisite of any NP
search using boosted W/Z bosons, but also a model-independent test of the
SM. Any excess of boosted W and Z bosons would be a promising hint of the
existence of NP.

Reconstructed decays of W's and Z's to jets can be used to search for NP
with specific final state signatures. Here we demonstrate such applications
by considering a heavy resonance that decays to a WW final state:
pp → X → WW, where X is a new heavy resonance beyond the SM, such as a new
heavy gauge boson or a Kaluza-Klein Z' in the Randall-Sundrum (RS) model.
We consider a search for an X signal by fully reconstructing the X signal
candidate in the decay mode where one W boson decays leptonically and the
other one decays hadronically. Note that for such high mass (~ 1 TeV)
resonance decays, more than 90% of the events have the two quarks from the
W boson
FIG. 6: Invariant mass of the W/Z jet candidates in the MC simulated event sample that is equivalent to 5 fb^-1 of LHC data
at 7 TeV center-of-mass energy: (a) before the likelihood cut, (b) after the likelihood cut.



decay within a cone of ΔR < 0.4. This makes it very difficult to identify
two separate jets. As a result, we select the leading jet with
p_T > 300 GeV and |η| < 2.5 in an event as the hadronically decaying W
boson candidate. The jet is reconstructed with the anti-k_T algorithm with
a distance parameter of ΔR = 0.6. The leptonically decaying W boson is
reconstructed by requiring one isolated lepton with p_T > 20 GeV and
|η| < 2.5, and more than 25 GeV of missing transverse energy in the event.
The presence of only one neutrino in the final state allows for the
reconstruction of its momentum by requiring transverse momentum
conservation and applying the W boson mass constraint. In doing so, we
obtain two solutions for the neutrino p_z, which lead to two reconstructed
WW masses. Studies show that the difference between the two reconstructed
masses is small, so we take the minimum of the two as the reconstructed
mass of the resonance X (m_X). The major SM backgrounds are the production
of W+jets, WW, and tt̄. In order to reduce the background, we explicitly
identify the boosted W jets by requiring their jet mass to be within 20 GeV
of the W boson mass, which is slightly more than twice the expected W jet
mass resolution in the MC simulation. We also apply W jet identification
(W ID), a selection on the likelihood variable as described before, to
reject more than half of the QCD jets while keeping more than 80% of the
signal. The invariant mass distributions of the X → WW candidates in the MC
simulated event sample that is equivalent to 5 fb^-1 of LHC data at 7 TeV
center-of-mass energy are shown in Figure 7. We estimate the expected 95%
C.L. upper limit on the product of the production cross section of a heavy
resonance X and the branching fraction for its decay into a WW pair. The
expected limit for 5 fb^-1 of LHC data at 7 TeV center-of-mass energy is
plotted as a function of the assumed X mass, as shown in Figure 8. For
comparison, we also plot the expected 95% C.L. upper limit without explicit
W jet identification. It is clear that the boosted W jet technique can
significantly improve our experimental sensitivity. The above discussion
can be directly applied to searches for heavy resonances that decay to
other diboson final states, such as X → ZZ/WZ.
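
The neutrino reconstruction with the W boson mass constraint mentioned above amounts to solving a quadratic equation for the neutrino p_z. The Python sketch below shows the standard solution for a massless lepton and neutrino; the function name `neutrino_pz`, the default W mass of 80.4 GeV and the treatment of a negative discriminant (both solutions set to the real part) are assumptions of this example, since the text does not specify them.

```python
import math


def neutrino_pz(lepton, met_x, met_y, m_w=80.4):
    """Return the two neutrino p_z solutions from the W mass constraint.

    `lepton` is (E, px, py, pz) of a (nearly) massless lepton; (met_x, met_y)
    is the missing transverse momentum, taken as the neutrino p_T.
    With mu = m_W^2 / 2 + pT_lep . pT_nu the solutions are
        p_z = [mu * pz_lep +- E_lep * sqrt(mu^2 - pT_lep^2 * pT_nu^2)] / pT_lep^2.
    """
    e_l, px_l, py_l, pz_l = lepton
    pt2_l = px_l ** 2 + py_l ** 2
    pt2_nu = met_x ** 2 + met_y ** 2
    mu = 0.5 * m_w ** 2 + px_l * met_x + py_l * met_y
    disc = mu ** 2 - pt2_l * pt2_nu
    root = math.sqrt(disc) if disc > 0.0 else 0.0   # assumption: drop the imaginary part
    return ((mu * pz_l - e_l * root) / pt2_l,
            (mu * pz_l + e_l * root) / pt2_l)
```

Each solution gives one neutrino four-vector and hence one WW invariant mass; the text then takes the smaller of the two masses as m_X.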



V. CONCLUSION 

In this paper we introduce a new approach to study jet 
substructure in the center-of-mass frame of the jet. We 
demonstrate that it can be used to discriminate boosted 
heavy particles from QCD jets. The method suggested 
in this paper is a proof of concept and is complemen- 
tary to the existing algorithms to identify boosted heavy 
particles based on jet substructure. 